Optimal Control of a Brownian Motion
by 

Herman  Chernoff 
and 

Albert  John  Petkau 

1. Introduction

In a recent paper, Rath [9] characterizes the solution
of an optimal stochastic control problem where the controller
can switch from either one of two modes to the other and, in
each mode, a diffusion process $Z(t)$ evolves according to a
reflected Brownian motion with drift and diffusion parameters
determined by the mode. In this problem, one possible
application of which concerns the queue length $Z(t)$ of
operations waiting to be performed in a computer, there are
different costs per unit time for each mode of operation,
there are switching costs for changing modes, and there is
a linear holding cost per unit time, $c_0 Z(t)$. The object
is to determine a policy which minimizes the long run average
cost or, more precisely, the infinite horizon expected average
cost. Rath demonstrates that the optimal policy among all
stationary policies consists of switching at two key levels
of the process. The proof involves approximating the problem
by a sequence of discrete time, discrete space random walk
problems, solving the latter, and going to the limit.

In  this  paper  we  consider  a generalization  of  this 
Bather [1,2,3] and subsequently used by various authors [5,6,7,8,11]
enables  one  to  work  with  the  diffusion  process  directly. 

This approach is more analytic and thus has potential
advantages in adding general insights on the behavior of the
solution to the original problem and its generalizations.

It also lends itself easily to numerical techniques.

The approach has two major drawbacks. First, in complicated
versions of the problem, e.g. those involving more than two
modes, the analytic approach becomes cumbersome. Second,
while it is easy to show that the candidate solutions which satisfy
the optimality conditions are optimal in the class of all
stationary procedures, there is difficulty, due to lack of
compactness, in demonstrating that such a candidate is optimal
among all possibly non-stationary procedures.

2.  The  Model 

Informally, we assume that there are $k$ modes. At
any given time we may switch instantaneously from mode $i$
to mode $j$ at a cost of $K_{ij} > 0$. While we are in mode $i$,
the cost is $c_i$ per unit time. Also the queue length $Z(t)$
changes according to reflected Brownian motion with mean drift
$\mu_i$ and variance $\sigma_i^2$ per unit time. The holding cost is
$h[Z(t)]$ per unit time.

More formally, for $i = 1,2,\ldots,k$ let $W_i(t)$ be a
Brownian motion with $W_i(0) = 0$, $E\,W_i(t) = \mu_i t$, and
$E\{[W_i(t+s) - W_i(s) - \mu_i t]^2 \mid W_i(s)\} = \sigma_i^2 t$ for all $s,t > 0$. If
mode $i_j$ is selected for the $j$th time period $(t_{j-1}, t_j]$,
where $0 = t_0 < t_1 < \cdots$, the basic diffusion process
originating at $y_0$ is

$$Y(t) = y_0 + \sum_{j=1}^{j'-1} [W_{i_j}(t_j) - W_{i_j}(t_{j-1})] + W_{i_{j'}}(t) - W_{i_{j'}}(t_{j'-1}),$$

where $j'$ denotes the period containing $t$.


Since we wish to represent a queue length which cannot go
below 0, the description of the current state
$X(t) = (i(t), Z(t))$ should contain the level $Z(t)$ of the
reflected controlled process where

$$Z(t) = Y(t) - \min\{0,\; Y(s) : 0 \le s \le t\}$$
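As a rough numerical illustration of the reflection map above, the following sketch applies $Z(t) = Y(t) - \min\{0, Y(s) : s \le t\}$ to a discretized path. The function names `reflect` and `simulate_free_path`, the Euler time step, and the parameter values are assumptions made for illustration only; they do not come from the paper.

```python
import math
import random


def reflect(path):
    """Apply Z = Y - min(0, running minimum of Y) to a sampled path Y."""
    running_min = 0.0  # the minimum is taken together with 0
    reflected = []
    for y in path:
        running_min = min(running_min, y)
        reflected.append(y - running_min)
    return reflected


def simulate_free_path(y0, mu, sigma2, dt, n_steps, rng):
    """Euler sketch of the unreflected process Y in a single fixed mode."""
    path = [y0]
    for _ in range(n_steps):
        step = mu * dt + math.sqrt(sigma2 * dt) * rng.gauss(0.0, 1.0)
        path.append(path[-1] + step)
    return path


# Illustrative values only (hypothetical, not taken from the paper).
rng = random.Random(0)
y_path = simulate_free_path(y0=0.5, mu=-0.2, sigma2=0.1, dt=0.01,
                            n_steps=1000, rng=rng)
z_path = reflect(y_path)
```

The reflected path never goes below 0 by construction, since the running minimum subtracted at each step is at most the current value of the free path.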
$$\int_s^t \{c_i + h[Z(t')]\}\,dt'$$

is incurred. We shall assume that $h(z) = o(e^{az})$ for all
$a > 0$ as $z \to \infty$. With no loss of generality we may assume
$h(0) = 0$ and $K_{i_1,i_2} + K_{i_2,i_3} > K_{i_1,i_3}$. The model with
$k = 2$ modes and $h(z) = c_0 z$, $c_0 > 0$, corresponds exactly
to the case considered by Rath. We shall restrict ourselves
to the case of 2 modes after first discussing the relatively
simple case of $k = 1$ mode where there is no control problem.

It is desired to select a policy which will minimize

$$\lim_{t \to \infty} t^{-1} E\{C(x_0, 0, t)\}$$

where $C(x,s,t)$ represents the total cost incurred over the
time interval $(s,t)$ when the state at time $s$ is $X(s) = x$.

This problem has a stationary or time-homogeneous character
which suggests that an optimal strategy should consist of
decomposing the set of possible current states $x = (i,z)$ into
subsets $C^*_{ii} = \{(i,z) : z \in C_{ii}\}$ of continuation states where
one remains in mode $i$ and $C^*_{ij} = \{(i,z) : z \in C_{ij}\}$ of switching
states where one switches from mode $i$ to mode $j$, if
$i \ne j$, for $i,j = 1,2,\ldots,k$. There is some literature [4,10]
which describes conditions under which there are optimal policies
which are stationary. We shall confine our attention to
such policies and later comment briefly on the more general
question of when the "good" policies we select are optimal among
all policies according to our long run expected average cost
criterion.

The Rath solution for the linear cost function consists
of $(a,b)$ switching policies with $0 < a < b \le \infty$ where one switches
from $i = 1$ to $i = 2$ if $z \ge b$ and one switches from $i = 2$ to $i = 1$
if $z \le a$. Thus $C_{11} = [0,b)$, $C_{12} = [b,\infty)$, $C_{21} = [0,a]$
and $C_{22} = (a,\infty)$. For simplicity we shall confine our
attention to those stationary policies where (1) each $C_{ij}$
consists of a finite number of non-degenerate intervals;
(2) $C_{ii}$ is an open subset of $[0,\infty)$ (0 is regarded as an
inner point); and (3) $C_{ij} \subset C_{jj}$. We shall call such
stationary policies regular.

3.  The  Potential  Function 

The basic advantage of dealing with stationary policies
is that the state $X(t)$ becomes a Markov process when such
a policy is applied. The stationary distribution which
derives from such Markov processes provides an alternative basis
for the proof of the existence of the analytic tool, the potential
function, which we introduce under the heuristic assumption
that for any stationary policy with finite long run expected
average cost $\gamma$,

(3.1)  $E\{C(x,t,T)\} = \gamma(T-t) + v(x) + o(1)$  as $T \to \infty$.

Then the function $v(x) = v(i,z) = v_i(z)$, the potential
function, provides the relative disadvantage of the initial
state $x = (i,z)$ compared with any other state $x' = (i',z')$.
(In this paper we shall leave $v$ determined up to an unknown
constant which will not be required and whose calculation is
more difficult than the analysis we require.)

Suppose now that $z \ne 0$ is a point of $C_{ii}$. Then the
following backward induction argument demonstrates that

(3.2)  $\mu_i v_i'(z) + \tfrac{1}{2}\sigma_i^2 v_i''(z) + c_i + h(z) = \gamma$  for $z \in C_{ii}$, $z \ne 0$.

The argument is that

$$E\{C((i,z), t-dt, T)\} = [c_i + h(z)]\,dt + E\{C((i,z+dZ), t, T)\} + o(dt)$$

where $dZ(t) = dW_i(t)$ has mean $\mu_i\,dt$ and variance $\sigma_i^2\,dt$.
Substituting in (3.1), expanding $v_i(z+dZ)$ about $z$, and
neglecting terms of order $o(dt)$, Equation (3.2) follows.
Equation (3.2) may be interpreted as expressing the overall
cost rate $\gamma$ as the sum of the current cost rate $c_i + h(z)$
plus one due to the expected movement of the diffusion process.
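For readers who want the intermediate algebra, the expansion step can be written out as follows. This is only a sketch of the heuristic argument in the text, using (3.1) with the $o(1)$ term ignored; no assumptions are introduced beyond those already stated.

```latex
\begin{align*}
\gamma\,(T - t + dt) + v_i(z)
  &= [c_i + h(z)]\,dt + \gamma\,(T - t) + E\{v_i(z + dZ)\} + o(dt), \\
E\{v_i(z + dZ)\}
  &= v_i(z) + \mu_i v_i'(z)\,dt + \tfrac{1}{2}\sigma_i^2 v_i''(z)\,dt + o(dt), \\
\gamma\,dt
  &= \bigl[c_i + h(z) + \mu_i v_i'(z) + \tfrac{1}{2}\sigma_i^2 v_i''(z)\bigr]\,dt + o(dt).
\end{align*}
```

Dividing the last line by $dt$ and letting $dt \to 0$ gives (3.2).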

If $z = 0 \in C_{ii}$, the reflected nature of the process
leads to

(3.3)  $v_i'(0) = 0$  for $0 \in C_{ii}$,

since $dZ = O_p(dt^{1/2})$. For switching states we have

(3.4)  $v_i(z) = K_{ij} + v_j(z)$  for $z \in C_{ij}$, $j \ne i$.

Finally the condition $h(z) = o(e^{az})$ as $z \to \infty$ for
$a > 0$ implies

(3.5)  $v_i(z) = o(e^{az})$  as $z \to \infty$ for $a > 0$, $i = 1,2,\ldots,k$,

so long as the policy does not permit $z$ to drift off to $\infty$.

The heuristic introduction (3.1) to the potential function
can be replaced by a more precise and rigorous result which
can be expressed in terms of the cost $D$ of a "game" starting
in state $x$ at time $t$ and terminating at time $T > t$ with
a terminal cost of $v[X(T)]$. If $v$ satisfies (3.2) - (3.5)
for a regular policy, then

(3.6)  $D(x,t,T) = E\{C(x,t,T)\} + E\{v[X(T)] \mid X(t) = x\} = \gamma(T-t) + v(x)$

can be established by a "backward induction" argument on $t$.


4.  Solutions  of  the  differential  equation  and  applications. 


It is instructive to see what (3.2), (3.3), and (3.5)
imply in the one mode case where there is no optimal control
problem. Here we may as well drop the subscript $i$. A
solution $f(z)$ of the differential equation (3.2) has the form

(4.1)  $f'(z) = \alpha e^{-2(\mu/\sigma^2)z} + \frac{\gamma - c}{\mu}\left[1 - e^{-2(\mu/\sigma^2)z}\right] - \frac{2}{\sigma^2}\int_0^z h(w)\,e^{-2(\mu/\sigma^2)(z-w)}\,dw$  for $\mu \ne 0$.


If $\mu > 0$, our process drifts off to $\infty$ and for $h(z) \to \infty$
as $z \to \infty$, we would have $\gamma = \infty$, and this solution of the
differential equation would be irrelevant for the one mode
problem. On the other hand it will be useful later for the
more general control problem, as will be the solution of (3.2)
for the case $\mu = 0$. If $\mu < 0$ an alternate representation
of the solution of (3.2) is

(4.2)  $f'(z) = \alpha^* e^{-2(\mu/\sigma^2)z} + \frac{\gamma - c}{\mu} + \frac{2}{\sigma^2}\int_z^\infty h(w)\,e^{-2(\mu/\sigma^2)(z-w)}\,dw$  for $\mu < 0$.


Finally if $\mu = 0$, the solution of (3.2) is

(4.3)  $f'(z) = \alpha + \frac{2(\gamma - c)}{\sigma^2}\,z - \frac{2}{\sigma^2}\int_0^z h(w)\,dw$  for $\mu = 0$.

The condition (3.3) leads to $\alpha = 0$ in (4.1). On the
other hand if $\mu < 0$ and (3.5) applies, we have $\alpha^* = 0$
in (4.2). These two "boundary" conditions imply

(4.4)  $\gamma = c + I$

where

(4.5)  $I = -\frac{2\mu}{\sigma^2}\int_0^\infty h(w)\,e^{2(\mu/\sigma^2)w}\,dw$,  $\mu < 0$.
In the special case where $h(z) = c_0 z$, the expressions
in equations (4.1) - (4.5) are easily evaluated. Thus, in the
one mode case $\gamma = \infty$ for $\mu > 0$ and $\gamma = c + I$ for
$\mu < 0$ where

(4.5)'  $I = I^* = -c_0\sigma^2/(2\mu)$  for $\mu < 0$.

The solution of (3.2) may be expressed by

(4.1)'  $f'(z) = \frac{1}{\mu}[\gamma - (c + I^*)] - \frac{c_0 z}{\mu} + \alpha e^{-2(\mu/\sigma^2)z}$  if $\mu \ne 0$,

(4.3)'  $f'(z) = \alpha + \frac{2(\gamma - c)}{\sigma^2}\,z - \frac{c_0}{\sigma^2}\,z^2$  if $\mu = 0$.

Note that the condition $f'(0) = 0$ in (4.1)' implies
$\alpha = -[\gamma - (c + I^*)]/\mu$. Thus in the one mode case with $\mu < 0$,

(4.4)'  $\gamma = c - c_0\sigma^2/(2\mu)$,

$f'(z) = -c_0 z/\mu$ and, except for an unknown constant, $v(z) = -c_0 z^2/(2\mu)$.
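The closed form (4.5)' can be checked numerically against the integral (4.5). The sketch below is such a check under assumed, purely illustrative parameter values; the function names and the truncation of the infinite integral at a finite upper limit are choices made here, not part of the paper.

```python
import math


def holding_cost_average(h, mu, sigma2, upper=60.0, n=60000):
    """Trapezoidal approximation of (4.5) for mu < 0, truncated at `upper`."""
    rate = 2.0 * mu / sigma2  # negative, so the weight decays in w
    dw = upper / n
    total = 0.0
    for k in range(n + 1):
        w = k * dw
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * h(w) * math.exp(rate * w)
    return -rate * total * dw


def linear_average(c0, mu, sigma2):
    """Closed form (4.5)' for h(z) = c0 * z and mu < 0."""
    return -c0 * sigma2 / (2.0 * mu)


# Hypothetical parameter values, for illustration only.
c, c0, mu, sigma2 = 0.3, 1.0, -0.5, 1.0
I_numeric = holding_cost_average(lambda w: c0 * w, mu, sigma2)
I_closed = linear_average(c0, mu, sigma2)
gamma = c + I_closed  # (4.4)': long run average cost in the one mode case
```

The same `holding_cost_average` routine accepts any holding cost satisfying the growth condition on $h$, provided the truncation point is large enough for the exponential weight to be negligible.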
To illustrate the evaluation of $\gamma$ and $v_i'$ for a two mode
problem, suppose that $C_{11} = (a_1, a_2)$ and $C_{22} = [0, b_1) \cup (b_2, \infty)$
where $0 < a_1 < b_1 < b_2 < a_2$ and $\mu_2 < 0$. Then $v_1'$
and $v_2'$ involve 4 unknown constants. These are
$\gamma$ and $\alpha_{21}$ ($\alpha$ for mode 2 with $z \le b_1$), $\alpha_{22}$ ($\alpha$ for
mode 2 with $z \ge b_2$) and $\alpha_1$ ($\alpha$ for mode 1 with $a_1 < z < a_2$).
On the other hand $v_2'(0) = 0$ implies $\alpha_{21} = 0$, and
$v_2(z) = o(e^{az})$ implies $\alpha_{22} - \frac{\gamma - c_2}{\mu_2} + \frac{I_2}{\mu_2} = 0$ (i.e., $\alpha^*_{22} = 0$).
Furthermore, the conditions $v_1(a_j) = v_2(a_j) + K_{12}$ and
$v_2(b_j) = v_1(b_j) + K_{21}$ imply that

$$\int_{a_j}^{b_j} [v_2'(z) - v_1'(z)]\,dz = K_{21} + K_{12}$$  for $j = 1,2$,

which imposes two more conditions on the four constants. These lead to
the determination of $\gamma$ and $v_i'(z)$ on the continuation sets
$C_{ii}$. On the switching states $C_{ij}$, $v_i(z) = v_j(z) + K_{ij}$.
From all of this calculus, $v$ is determined up to an unknown
additive constant, the determination of which, even in the one
mode problem, requires analysis of the stationary distribution
of the Markov process governing the state $(i,z)$ as a function
of time.


5.  Optimality  Conditions 


Two very natural optimality conditions on the policy are

(5.1)  $v_i(z) \le K_{ij} + v_j(z)$  for $z \in C_{ii}$

(5.2)  $c_j + \mu_j v_j'(z) + \frac{\sigma_j^2}{2} v_j''(z) \le c_i + \mu_i v_j'(z) + \frac{\sigma_i^2}{2} v_j''(z)$  for $z \in \mathcal{I}(C_{ij})$

where $\mathcal{I}$ represents the interior and the latter condition
derives from considering the consequence of staying in mode
$i$ for a short period of time while $z \in \mathcal{I}(C_{ij})$. During this
period $v_i(z) = K_{ij} + v_j(z)$ and hence $v_i'(z) = v_j'(z)$ and
$v_i''(z) = v_j''(z)$. Related to these conditions is the smoothness
condition that the right hand and left hand derivatives of
the potential function are equal on $\partial C_{ij}$, the boundary of
$C_{ij}$. If we label this common derivative $v_i'(z)$ and again
use the fact that $v_i(z) = K_{ij} + v_j(z)$ on $C_{ij}$ we may write

(5.3)  $v_i'(z) = v_j'(z)$  for $z \in \partial C_{ij}$.

Applying these optimality conditions with a backward induction
argument yields part of a sufficiency result. Details of such an
argument appear in [7]. To be more explicit, suppose that $\mathcal{P}_0$ is a
regular stationary policy and $v_0$ and $\gamma_0$ satisfy the equations
(3.2) to (3.5) and (5.1) to (5.3). Then for any alternative
measurable policy $\mathcal{P}$ (not necessarily stationary)

(5.4)  $D(x,t,T) = E\{C(x,t,T)\} + E\{v_0[X(T)] \mid X(t) = x\} \ge \gamma_0(T-t) + v_0(x)$

(where $E$ represents expectation with respect to policy $\mathcal{P}$)
with equality for $\mathcal{P} = \mathcal{P}_0$. It is clear that if conditions
are such that the term $E\{v_0[X(T)] \mid X(t) = x\}$ in (5.4) must be
$o(T-t)$ then $\mathcal{P}_0$ is optimal and $\gamma_0$ is the optimal long
run average cost. It is somewhat peculiar that the main
obstacle in establishing the optimality of $\mathcal{P}_0$ lies in the
possibility of a better non-stationary $\mathcal{P}$ for which $v_0(X(T))$
may occasionally be very large, a possibility that is intuitively
associated only with poor policies.

If $\mu_1$ and $\mu_2$ are both negative, it is clear that
$E\{v_0[X(T)] \mid X(t) = x\} = O(1)$ and a candidate stationary policy
$\mathcal{P}_0$ satisfying the optimality conditions (5.1) to (5.3) will
be optimal. If $\mu_1 > 0$ it is easy to see that $\mathcal{P}_0$
is optimal among the class of regular stationary alternatives.
For a regular alternative which uses only mode 2 when $Z(t)$
is large, $E\{v_0[X(T)] \mid X(t) = x\} = O(1)$. On the other hand,
for a regular alternative which allows one to stay in mode
1 when $Z(t)$ is large, $\gamma = \infty$.

To establish optimality among all measurable alternatives
when $\mu_1 > 0$ is more difficult. In Section 8 a proof is
outlined for the linear holding cost case. That proof can
be generalized somewhat but a clear understanding seems to
require a different approach.
6. The case of one transient mode

Our general approach will be to consider some simple
policies and to catalog those parameters $\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, c_1, c_2$,
$K_{12}$ and $K_{21}$ for which such policies satisfy the optimality
conditions. We shall pay special attention to the conditions
(5.1) and (5.3).

To avoid undue attention to fussy details we shall assume
that at least one mean, say $\mu_2$, is negative, that $h(z) > 0$
for $z > 0$ and that for large $z$, $h(z)$ is large enough
to make the continued use of a mode with positive mean
prohibitive. We shall be more explicit about this last condition
shortly.

Let

(6.1)  $\beta_i = c_i + I_i$

where $I_i$ is the weighted average of $h$,

(6.2)  $I_i = -\frac{2\mu_i}{\sigma_i^2}\int_0^\infty e^{2(\mu_i/\sigma_i^2)w}\,h(w)\,dw$,  if $\mu_i < 0$.

Then $\beta_i$ is the long run average cost corresponding to the
exclusive use of mode $i$ if $\mu_i < 0$. If $\mu_i > 0$, the
long run average cost would be at least $c_i + \liminf_{z \to \infty} h(z)$.

Thus combining all of our conditions, we shall hereafter
assume

H0:  $\mu_2 < 0$
H1:  $h(0) = 0$ and $h(z) > 0$ for $z > 0$
H2:  $h(z) = o(e^{az})$ as $z \to \infty$ for $a > 0$
H3:  $h(z)$ is continuous
H4:  $\liminf_{z \to \infty} h(z) > \max(I_2,\; I_2 + c_2 - c_1)$

Assumption H4 states that the average holding cost in mode 2
is less than that for large $z$. Further if $\mu_1 > 0$, using
mode 1 would lead to an average cost which would exceed
$c_2 + I_2$, that of using mode 2 exclusively. If both $\mu_1$
and $\mu_2$ are negative we shall find it convenient to relabel
the subscripts so that

H5:  $\sigma_1^{-2} I_1 > \sigma_2^{-2} I_2$  if $\mu_1, \mu_2 < 0$,

assuming for minor convenience that equality does not obtain.
Note that under these assumptions, $I_i > 0$ for $\mu_i < 0$
and that in principle we allow $c_i < 0$.

The simplest policy is that where one mode is transient.
By this we mean that the policy is such that after shifting
from that mode, it is never revisited. In particular we shall
assume that mode 1 is transient and hence $\gamma = \beta_2 = c_2 + I_2$.

Theorem 6.1. The optimality conditions (5.1) and (5.3) are
satisfied for a transient policy with $C_{11} = [0,b)$ and
$C_{12} = [b,\infty)$, provided

(6.3)  $c_1 - c_2 < I_2(1 - \sigma_1^2/\sigma_2^2)$

(6.4)  $c_1 - c_2 > I_2 - I_1$ (i.e., $\beta_1 > \beta_2$)  if $\mu_1 < 0$

(6.5)  $K = K_{12} + K_{21} > L$

for some positive $L$ depending on $\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, c_1, c_2$ and $h$.
The optimality condition (5.2) is satisfied for the linear
holding cost $h(z) = c_0 z$.


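Proof: We shall defer the proof of the last sentence till
the next section, and we shall first treat

Case A: $\mu_1 > 0$.

Let $g(z) = f_1(z) - f_2(z)$ where $f_1$ and $f_2$ are the
special solutions of (3.2) represented by $f_1(0) = f_2(0) = 0$,

(6.6)  $f_1'(z) = \frac{2}{\sigma_1^2}\int_0^z e^{2(\mu_1/\sigma_1^2)(w-z)}\,[\gamma - c_1 - h(w)]\,dw$,

(6.7)  $f_2'(z) = \frac{\gamma - c_2}{\mu_2} + \frac{2}{\sigma_2^2}\int_z^\infty e^{2(\mu_2/\sigma_2^2)(w-z)}\,h(w)\,dw$,

and if $\mu_1 \ne 0$

(6.8)  $g'(z) = f_1'(z) - f_2'(z) = \frac{\gamma - c_1}{\mu_1}\left(1 - e^{-2(\mu_1/\sigma_1^2)z}\right) - \frac{2}{\sigma_1^2}\int_0^z e^{2(\mu_1/\sigma_1^2)(w-z)}\,h(w)\,dw - \frac{\gamma - c_2}{\mu_2} - \frac{2}{\sigma_2^2}\int_z^\infty e^{2(\mu_2/\sigma_2^2)(w-z)}\,h(w)\,dw$.

If $\mu_1 = 0$, the first term on the right must be replaced by
$2z(\gamma - c_1)/\sigma_1^2$. Since $g'(0) = 0$ and (6.3) implies
$g''(0) = 2[\sigma_1^{-2}(\gamma - c_1) - \sigma_2^{-2} I_2] > 0$, $g'(z)$ is positive for
small positive $z$. On the other hand H4 implies that
as $z \to \infty$, $f_1'(z)$ becomes negative and $f_2'(z)$ becomes
positive. Hence $g'(z)$ has a first positive root $b$, and we
write $L = g(b) = \int_0^b g'(z)\,dz$.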

We shall now show that the potential function for $C_{11} = [0,b)$
and $C_{12} = [b,\infty)$ will satisfy conditions (5.1) and (5.3). Let
$v_2(z) = f_2(z) + \ell_2$, and let $v_1(z) = f_1(z) + \ell_1$ for
$0 \le z < b$ and $v_1(z) = v_2(z) + K_{12}$ for $z \ge b$. Then $v$ is our
potential function (except for an additive constant) provided
$v_1(b) = v_2(b) + K_{12}$ or $\ell_2 - \ell_1 = L - K_{12}$. Since
$g'(z) > 0$ for $0 < z < b$, $v_1 - v_2$ attains its minimum
of $\ell_1 - \ell_2 = K_{12} - L$ at $z = 0$, and so (6.5) implies (5.1).
Since $g'(b) = 0$, the smoothness condition (5.3) is satisfied.

Note that if we increase $c_1$ so that $c_1 - c_2 \to I_2(1 - \sigma_1^2/\sigma_2^2)$,
$g'(z)$ decreases and the corresponding values of $b$ and $L$
approach zero monotonically.

Case B: $\mu_2 < 0$, $\mu_1 < 0$.

While the sign of $g''(0)$ is determined to be positive
with the same argument as in Case A, the sign of $g'(z)$ for
large $z$ is that of $(\beta_1 - \gamma)/\mu_1 = (\beta_1 - \beta_2)/\mu_1$ since

$$f_1'(z) = \frac{\gamma - c_1}{\mu_1}\left[1 - e^{-2(\mu_1/\sigma_1^2)z}\right] + \frac{I_1}{\mu_1}\,e^{-2(\mu_1/\sigma_1^2)z} + \frac{2}{\sigma_1^2}\int_z^\infty e^{2(\mu_1/\sigma_1^2)(w-z)}\,h(w)\,dw.$$

Thus (6.4) implies that $g'(z)$ is negative for
large $z$. The remainder of the argument in Case A applies
equally well to Case B.

Thus as we promised we have proved all but the last
sentence of Theorem 6.1. □

It follows from this proof that as $c_1$ increases so that
$c_1 - c_2$ approaches $I_2(1 - \sigma_1^2/\sigma_2^2)$, $g'(z)$ decreases and
the corresponding values of $b$ and $L$ approach zero
monotonically. Thus one would anticipate that larger values of
$c_1$ would yield optimal policies with $C_{11}$ null. In Case B
where $\mu_1 < 0$, as $c_1$ decreases so that $c_1 - c_2$ approaches
$I_2 - I_1$, $b$ and $L$ increase monotonically. (The limiting
value of $b$ may be $+\infty$. This is the case for linear holding
cost as is easily derived from the analysis that follows shortly.)
As $c_1$ decreases below this level, a transient policy to be
optimal clearly would have to use mode 2 as the transient
mode. In the next section we shall show that this is not the
case, i.e., that the optimal policy has no transient mode when
$c_1 + I_1 < c_2 + I_2$.

The particular values of $K_{12}$ and $K_{21}$ which yield a
given sum $K$ have no influence on $\gamma$ for a non-transient
two mode policy. This is more or less obvious since any such
policy which involves more than one switch pays $K$ for every
pair of switches. For policies with a transient mode,
increasing $K$ does not affect $\gamma$. On the other hand, decreasing
$K$ below $L$ leads to the possible optimality of non-transient
policies.
$f_2'(z) = [-c_0 z + (\gamma - \beta_2)]/\mu_2$ *

Thus $g''$ is strictly monotone decreasing in both cases A and B
and hence the positive solution $b$ of $g'(z) = 0$ is unique.
Moreover the conditions (6.3), (6.4) and (H.5) translate to

(6.3')  $c_1 - c_2 + c_0(\sigma_2^2 - \sigma_1^2)/(2\mu_2) < 0$

(6.4')  $c_1 - c_2 + \frac{c_0}{2}\left(\frac{\sigma_2^2}{\mu_2} - \frac{\sigma_1^2}{\mu_1}\right) > 0$  if $\mu_1 < 0$

and

(H.5')  $\mu_2 < \mu_1 < 0$.

* The term $\gamma - \beta_2$ is zero here and is inserted as a convenience
for reference in the more general case of Section 7 where it appears
as a negative quantity.


7. Two non-transient modes

The next simplest type of policy to consider is that
where $C_{11} = [0,b)$ and $C_{22} = (a,\infty)$ where $0 < a < b$.
For such a policy neither mode is transient and both modes
will recur infinitely often in the long run. We shall treat
three cases, including two in parallel with those of
Section 6. We shall determine conditions under which such policies
satisfy the optimality conditions (5.1) and (5.3) and show that
(5.2) is also satisfied in the case of linear holding cost. These
results provide a complete classification of optimal policies
in the linear case.

Theorem 7.1. The optimality conditions (5.1) and (5.3) are
satisfied for a policy with $C_{11} = [0,b)$ and $C_{22} = (a,\infty)$ with
$0 < a < b$ for appropriate $\gamma < \beta_2$ in the following cases.

Case A:  $\mu_1 > 0$ and $0 < K = K_{12} + K_{21} < L$
Case B:  $\mu_1 < 0$, $c_1 - c_2 > I_2 - I_1$, and $0 < K = K_{12} + K_{21} < L$
Case C:  $\mu_1 < 0$, $c_1 - c_2 < I_2 - I_1$, and $0 < K = K_{12} + K_{21} < L_0$

for some appropriate $L_0$ depending on $c_1, c_2, \mu_1, \mu_2, \sigma_1^2$
and $\sigma_2^2$. If the holding cost is linear, $h(z) = c_0 z$, $L_0 = \infty$
and the optimality condition (5.2) is satisfied in all three cases.

Proof: We shall treat the first two cases in parallel
with those of Section 6. Our proof of the last sentence will
be adequate to cover the last sentence of Theorem 6.1. In
Section 6 we took the precaution of using $\gamma$ in place of
$\beta_2$ in our formulae even though they were equal. As a result
equations (6.6) to (6.8) may still be applied.

Note from (6.8) that $\partial g'(z)/\partial\gamma > 0$. If we regard $g'(z)$ as a
function of $z$ and $\gamma$, say $g_1(z,\gamma)$, then we are concerned
with how the values of $z$ for which $g_1(z,\gamma) = 0$ change as
$\gamma$ decreases from $\gamma = \beta_2$. At $\gamma = \beta_2$ these are the
values 0 and $b$ of the cases treated in Theorem 6.1. As
$\gamma$ decreases, one root $z_1(\gamma)$ is monotonically increasing
from 0 and the next positive root $z_2(\gamma)$ is monotonically
decreasing from $b$ until they meet at a common value $z_0$
corresponding to a value $\gamma_0$ of $\gamma$ and such that
$g_1(z_0,\gamma_0) = \partial g_1(z_0,\gamma_0)/\partial z = 0$. These roots $z_1(\gamma)$ and
$z_2(\gamma)$ represent values of $a$ and $b$ for which the
optimality conditions (5.1) and (5.3) are satisfied with
$K_{12}$ and $K_{21}$ values for which

$$K = K_{12} + K_{21} = \int_{z_1}^{z_2} g_1(z,\gamma)\,dz$$

is monotonically decreasing from $L$ to 0 as $\gamma$ decreases
from $\beta_2$ to $\gamma_0$. The case where $z_1 = z_2 = z_0$ and
$\gamma = \gamma_0$ corresponds to the limiting case of zero switching
costs. This disposes of Cases A and B of Theorem 7.1.
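For the linear holding cost the root-tracking argument above is easy to reproduce numerically. The sketch below takes $f_1'$ from (4.1)' with the constant fixed by $f_1'(0) = 0$ and $f_2'(z) = [-c_0 z + (\gamma - \beta_2)]/\mu_2$, locates the positive roots of $g_1(z,\gamma)$ by a grid scan followed by bisection, and integrates $g_1$ between them with Simpson's rule. All parameter values (a Case A configuration with $\mu_1 > 0$), the function names, the grid step and the search range are assumptions chosen for illustration, not values from the paper.

```python
import math

# Hypothetical parameter values chosen only for illustration (Case A: mu1 > 0).
MU1, SIG1SQ, C1 = 0.5, 1.0, 0.0
MU2, SIG2SQ, C2 = -1.0, 1.0, 1.0
C0 = 1.0  # slope of the linear holding cost h(z) = C0 * z

I2 = -C0 * SIG2SQ / (2.0 * MU2)  # (4.5)' for mode 2
BETA2 = C2 + I2                  # cost of using mode 2 exclusively


def g_prime(z, gamma):
    """g'(z) = f1'(z) - f2'(z) for the linear holding cost (mu1 != 0)."""
    i1_star = -C0 * SIG1SQ / (2.0 * MU1)
    a1 = (gamma - (C1 + i1_star)) / MU1
    # f1' from (4.1)' with the constant fixed by f1'(0) = 0
    f1 = a1 * (1.0 - math.exp(-2.0 * MU1 * z / SIG1SQ)) - C0 * z / MU1
    # f2' = [-c0 z + (gamma - beta2)] / mu2
    f2 = (-C0 * z + (gamma - BETA2)) / MU2
    return f1 - f2


def positive_roots(gamma, z_max=10.0, step=0.01):
    """Sign-change scan on a grid, refined by bisection.

    A root lying exactly on a grid point would be missed by this sketch.
    """
    roots = []
    n = int(round(z_max / step))
    lo = step
    for k in range(1, n):
        hi = (k + 1) * step
        if g_prime(lo, gamma) * g_prime(hi, gamma) < 0.0:
            a, b = lo, hi
            for _ in range(60):
                mid = 0.5 * (a + b)
                if g_prime(a, gamma) * g_prime(mid, gamma) <= 0.0:
                    b = mid
                else:
                    a = mid
            roots.append(0.5 * (a + b))
        lo = hi
    return roots


def integrate(lo, hi, gamma, n=2000):
    """Composite Simpson rule for the integral of g' over [lo, hi] (n even)."""
    h = (hi - lo) / n
    total = g_prime(lo, gamma) + g_prime(hi, gamma)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g_prime(lo + k * h, gamma)
    return total * h / 3.0


# Transient case gamma = beta2: roots are 0 and b, and L is the integral over [0, b].
b_transient = positive_roots(BETA2)[0]
L = integrate(0.0, b_transient, BETA2)

# A slightly smaller gamma gives an interior pair (a, b) and a smaller K.
gamma = BETA2 - 0.1
a, b = positive_roots(gamma)[:2]
K = integrate(a, b, gamma)
```

With these assumed values the lower root moves up from 0 and the upper root moves down from the transient threshold, and the computed $K$ falls between 0 and $L$, in line with the monotonicity described in the proof.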
Now let us treat

Case C: $\mu_1 < 0$, $c_1 - c_2 < I_2 - I_1$.

First let $c_1$ decrease in Case B so that $c_1 - c_2 \to I_2 - I_1$
or equivalently $\beta_1$ decreases toward $\beta_2$. Then, in the
transient case where $\gamma = \beta_2 = c_2 + I_2$, $g'(z)$ increases
monotonically to a limiting function $g_0'(z)$ which vanishes
at $z = 0$ and is positive for positive $z$ close to 0.
The corresponding values of $b$ and $L$ increase monotonically
to possibly infinite limits $b^*$ and $L^*$.

As $c_1$ passes below $I_2 - I_1 + c_2$ (Case C), $\beta_1$
decreases below $\beta_2$ and policies where mode 1 is transient
can no longer be optimal. Moreover $\gamma < \beta_1 < \beta_2$ and $\beta_2 - \gamma$
is small. Then $g'(0) = (\beta_2 - \gamma)/\mu_2 < 0$ and $\lim_{z \to \infty} g'(z) = -\infty$.
But for moderately small positive values of $z$, $g'(z)$ is
sufficiently close to the positive limit of the transient case
mentioned above that $g'(z)$ will be positive for some positive
$z$. Thus $g'(z)$ has at least two positive roots, one of which
is close to zero and the first two of which can be labeled $a$
and $b$ and correspond to the optimality conditions (5.1) and
(5.3) for some $K = \int_a^b g'(z)\,dz$. As $c_1$ and $\gamma - c_1$ decrease,
$g'(z)$ decreases and $K$, $a$, and $b$ will behave monotonically
until $K$ reaches 0 and $a$ and $b$ come together as in the
discussion of Cases A and B. For fixed $c_1$, as $\gamma$ increases
to $\beta_1$, $K$ increases to some limit $L_0$ depending on $c_1$, $c_2$,
$\mu_1$, $\mu_2$, $\sigma_1^2$ and $\sigma_2^2$.
The value of $L_0$ depends upon $g_0'(z)$ which can be written

(7.2)  $g_0'(z) = \mu_1^{-1}\{I_1 - I_1(z)\} - \mu_2^{-1}\{I_2 - I_2(z)\}$

where

$$I_i(z) = -\frac{2\mu_i}{\sigma_i^2}\int_z^\infty e^{2(\mu_i/\sigma_i^2)(w-z)}\,h(w)\,dw, \quad i = 1,2,$$

are exponentially weighted averages of $h(w)$ for $w > z$. Let
us now consider the case where $c_1$ is decreased by $\delta$
from $I_2 - I_1 + c_2$ and $\gamma - c_1$ stays fixed at $\beta_1 - c_1$.
Then $g'(z) = g_0'(z) + \delta/\mu_2$ and for $\delta$ sufficiently small
$g'(z) = 0$ has a root $a_0 > 0$ close to 0. Let $b_0$ be the
second positive root if there is one and $\infty$ otherwise. Then

$$L_0 = \int_{a_0}^{b_0} g'(z)\,dz.$$

If $g_0'(z)$ is strictly monotonically increasing,
$L_0 = \infty$ and $b_0 = \infty$ for all $\delta > 0$. Otherwise $L_0$ will
vanish for $c_1$ sufficiently small. If the holding cost is
linear, $h(z) = c_0 z$, then $g_0'(z) = c_0 z(\mu_2^{-1} - \mu_1^{-1})$ and $L_0 = \infty$
for all $c_1 < I_2 - I_1 + c_2$.

It remains only to prove that the optimality condition (5.2)
is satisfied in all three cases if $h(z) = c_0 z$. Our proof
will also apply to Theorem 6.1. We begin with some general
considerations and specialize to the linear case. Let us apply
condition (5.2) to the policy $C_{11} = [0,b)$ and $C_{22} = (a,\infty)$,
in which case $v_1'(z) = f_2'(z)$ on $C_{12}$ and $v_2'(z) = f_1'(z)$ on $C_{21}$.
Both $f_1'(z)$ and $f_2'(z)$ satisfy (3.2) and hence
$c_i + \mu_i f_i'(z) + \tfrac{1}{2}\sigma_i^2 f_i''(z) = \gamma - h(z)$ for $i = 1,2$.
Substituting in (5.2) we have

(7.3)  $\mu_1 g'(z) + \frac{\sigma_1^2}{2}\,g''(z) \le 0$  for $z > b$

and

(7.3')  $\mu_2 g'(z) + \frac{\sigma_2^2}{2}\,g''(z) \ge 0$  for $z < a$

where (7.3') disappears in the transient case of Section 6.

If $g'(z) < 0$ and $g''(z) > 0$ for $z < a$, (7.3') applies.
If $g'(z) < 0$ and $g''(z) < 0$ for $z > b$ and $\mu_1 > 0$, (7.3)
applies.

We now proceed to the special case of $h(z) = c_0 z$ to
show that both (7.3) and (7.3') apply. As we noted in Section
6, $g''(z)$ is monotone decreasing in both cases A and B.
It is easily seen to be monotone in Case C also and hence
must be monotone decreasing with a root between $a$ and $b$.
Hence (7.3') applies with strict inequality. On the other
hand $\mu_1 g' + \sigma_1^2 g''/2$ is linear in $z$ with slope
$c_0(\mu_1 - \mu_2)/\mu_2 < 0$ and hence it suffices to establish (7.3)
for $z = b$. But $g'(b) = 0$ and $g''(b) < 0$ and thus (7.3)
applies with strict inequality.
Let us review the current status. For the linear holding
cost we have shown that the simple policies of Sections 6
and 7 apply to a large set of possible parameter values.
What have we omitted? To minimize discussion we have avoided
"boundary" cases where $a$ or $b$ or $K$ are zero or where
$\beta_1 = \beta_2$ when $\mu_1 < 0$ or when $\mu_1 = \mu_2$, but these cases
are not particularly deep. The case $\mu_1 < \mu_2 < 0$ was covered
in the case $\mu_2 < \mu_1 < 0$ by interchanging subscripts. (In
the non-linear case, the condition $\mu_2 < \mu_1 < 0$ is replaced
by (H.5).) Considering the fact that as $c_1$ increases to
$c_2 + I_2(1 - \sigma_1^2/\sigma_2^2)$, $b$ and $L \to 0$ in the transient case,
if one could establish the optimality, i.e., the sufficiency
of the optimality conditions, one would be led to the
conclusion that when $c_1 \ge c_2 + I_2(1 - \sigma_1^2/\sigma_2^2)$ an optimal policy
requires a null $C_{11}$. Thus we have shown that the policies
of Sections 6 and 7 are the class of optimal policies for all
parameter values involving linear holding costs, provided
we can establish the sufficiency of the optimality conditions
(5.1) to (5.3). This is accomplished in Section 8.

What is the situation for the non-linear case? If
$g_0(z)$ is not monotone increasing, we must seek more complex
policies for some parameter values. If $g_0(z)$ is strictly
monotone increasing, it is possible that the simple policies
will suffice if we could establish (7.3) and (7.3'). In
Section 9 we present an example involving a non-monotone
holding cost where an optimal policy requires
$C_{22} = [0,b_1) \cup (b_2,\infty)$ while $C_{11} = (a_1,a_2)$ where
$0 < a_1 < b_1 < b_2 < a_2$. One may conjecture that the simple
"two point" policies of Sections 6 and 7 contain all optimal
policies if $h(z)$ is monotone and approaches infinity as
$z \to \infty$. However that conjecture is not valid in general.

Computations were carried out for the case where $\mu_1 = -0.2$,
$\sigma_1^2 = 0.1$, $c_1 = 0.23$, $\mu_2 = -1.0$, $\sigma_2^2 = 1.0$,
$c_2 = 0.0$, and $\gamma = 0.46$ while $h(z)$ is constructed with 3
line segments which have slope 1 for $0 \le z \le 0.8$,
0.1 for $0.8 \le z \le 2.0$ and 20.0 for $z \ge 2.0$. Then
$g'(z) < 0$ except in the interval (0.04, 0.85) but
$\mu_1 g'(z) + \sigma_1^2 g''(z)/2$ is positive in (1.722, 2.025). Thus
condition (7.2) fails for the only candidate for a "two-point"
optimal policy. Note that in this example $\sigma_1^{-2} I_1 = 2.425$,
$\sigma_2^{-2} I_2 = 0.591$, and $0.23 = c_1 < c_2 + I_2 - I_1 = 0.349$.
Thus we are in Case C but $g_0(z)$ is not monotonic. It attains
a local maximum of 1.967 at $z = 0.767$ but starts to increase
again after $z = 1.137$.
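
The two quoted averages in this example can be checked numerically. The sketch below is only an illustration and rests on several assumptions that are not stated in the passage: that $h(0) = 0$, that $I_i$ denotes the stationary mean of $h(Z)$ under mode $i$, that reflected Brownian motion with drift $\mu < 0$ and variance $\sigma^2$ has an exponential stationary law with rate $2|\mu|/\sigma^2$, and that $\mu_2 = -1.0$. The function names are hypothetical.

```python
import math

# Breakpoints and slopes of the three-segment holding cost quoted in the text.
# h(0) = 0 is an assumption; the text specifies only the slopes.
BREAKS = [0.0, 0.8, 2.0, math.inf]
SLOPES = [1.0, 0.1, 20.0]


def h(z):
    """Piecewise-linear holding cost, continuous, with h(0) = 0 (assumed)."""
    total = 0.0
    for lo, hi, slope in zip(BREAKS[:-1], BREAKS[1:], SLOPES):
        if z > lo:
            total += slope * (min(z, hi) - lo)
    return total


def stationary_mean_cost(mu, sigma2):
    """E[h(Z)] for Z ~ Exponential(rate = 2|mu|/sigma2), with mu < 0.

    Uses E[h(Z)] = integral of h'(z) * P(Z > z) dz, valid because h(0) = 0.
    """
    if mu >= 0:
        raise ValueError("a stationary distribution requires mu < 0")
    rate = -2.0 * mu / sigma2
    total = 0.0
    for lo, hi, slope in zip(BREAKS[:-1], BREAKS[1:], SLOPES):
        tail_hi = 0.0 if math.isinf(hi) else math.exp(-rate * hi)
        total += slope * (math.exp(-rate * lo) - tail_hi) / rate
    return total


I1 = stationary_mean_cost(mu=-0.2, sigma2=0.1)
I2 = stationary_mean_cost(mu=-1.0, sigma2=1.0)  # mu_2 = -1.0 is an assumption
print(I1 / 0.1, I2 / 1.0, I2 - I1)
```

Under these assumptions the three printed numbers agree, to three decimals, with the values 2.425, 0.591 and 0.349 quoted in the example.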

8. Sufficiency of the Optimality Conditions

In Section 5 we found it relatively easy to establish the
optimality of a candidate stationary policy satisfying
the optimality conditions (5.1) to (5.3) with $\gamma = \gamma_0$ and
$v = v_0$ when this policy is compared with other regular
stationary policies and, in the case where $\mu_1 < 0$, when it is
compared with all measurable policies. If $\mu_1 \ge 0$, another
proof is required to establish optimality in the class of all
measurable policies. We shall outline such a proof for the
special case of linear holding cost. Since the proof is
clumsy, seems to have limited prospects for generalization,
and does not appear to confront the main issues and conditions
which should be illuminated by an insightful proof, ours
will be informal and sketchy.

The main points of our proof consist of showing first
that given an arbitrary policy one can do almost as well
over a long time period $T$ with a policy that applies mode
2 for a substantial time period whenever $Z(t)$ exceeds $T^r$
for some $r$ between .5 and 1. Second, such a restricted
policy can do better than $\gamma_0$ over $(0,T)$ only if
$T^{-1} E v_0[X(T)]$ is not small, in which case the expected holding
cost over the interval $(T - T^{r+\delta}, T)$ with $r + \delta < 1$ is so
substantial that the average expected cost over $(0, T - T^{r+\delta})$
is less than what is attainable and a contradiction results.

Let $\delta > 0$ and $1 + 3\delta < 2r < 2r + 4\delta < 2$. We shall
use the fact that for $0 \le t \le s \le T$, and any $n > 0$,

$$P\Bigl\{ \sup_{i,\ |t-s| < T^r} |W_i(s) - W_i(t)| > k T^{\delta + r/2} \Bigr\} = o(T^{-n}) \quad \text{as } T \to \infty .$$

Given any policy $\mathcal{P}$ with state process $X(t)$, we define
a modified version $\mathcal{P}_T$ over the interval $(0,T)$ which
follows $\mathcal{P}$ until $Z(t) \ge T^r$, at which time mode 2 is applied
for a time interval of length $-T^r/(2\mu_2)$, after which one matches
the modes which would have been used had $\mathcal{P}$ been followed.
As soon as the modified queue length $Z_T(t) \ge T^r$ once again,
one repeats mode 2 for another time interval of length $-T^r/(2\mu_2)$.
During the era consisting of the time between the first arrival at
$T^r$ and the next arrival of $Z_T(t)$ at 0, the time duration
in mode 2 has been increased by an amount $t_1^*$ because of
our modification. At the end of the era $Z_T(t) \le Z(t)$.
Succeeding eras, from the arrivals at $T^r$ followed by the
returns to 0, involve increased durations in mode 2 of
$t_2^*, t_3^*, \ldots$.

The modified policy may lead to certain increases of cost.
That due to additional switching is $O(T^{1-r})$ since there are
at most $O(T^{1-r})$ additional switches of mode. The additional
cost due to the difference in $c_i$ is $O(T)$. The difference
in holding cost is bounded as follows. Let $t^*$ represent the
current value at time $t$ of the increased time duration in
mode 2 in the current or $i$-th era. Then

$$Z_T(t) < Z(t) + (\mu_2 - \mu_1) t^* + T^{\delta}\bigl(k (t^*)^{1/2} + 1\bigr) \quad \text{for } 0 < t^* < t_i^* ,$$

with probability exceeding $1 - o(T^{-n})$. But $-k_1 t^* + k_2 (t^*)^{1/2}$
attains a maximum value of $k_2^2/4k_1$ and hence the additional
cost due to the difference in holding cost is with large
probability $O(T^{1+2\delta})$.
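
The maximum quoted above follows from elementary calculus. A short derivation is sketched here, treating $k_1 > 0$ and $k_2 > 0$ as constants and writing $s$ for the extra time spent in mode 2 (the symbols $f$ and $s$ are notation assumed for this sketch only):

```latex
f(s) = -k_1 s + k_2 s^{1/2}, \qquad
f'(s) = -k_1 + \tfrac{1}{2} k_2 s^{-1/2} = 0
\iff s^{1/2} = \frac{k_2}{2 k_1},
\qquad
\max_{s \ge 0} f(s) = -\frac{k_2^2}{4 k_1} + \frac{k_2^2}{2 k_1} = \frac{k_2^2}{4 k_1}.
```

If, as the bound above suggests, $k_1$ is a fixed positive constant and $k_2$ is of order $T^{\delta}$, the excess queue length is of order $T^{2\delta}$, and accumulating it over a period of length $T$ gives the stated order of the additional holding cost.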

None of these extra costs are incurred if $Z(t) < T^r$
for $0 \le t \le T$. If $Z(t) \ge T^r$ for some $t \le T$, then the
holding cost using $\mathcal{P}$ is very likely to exceed a power of $T$
(the exponent is illegible in the source) which
is large compared with the possible additional expense incurred
by using $\mathcal{P}_T$. It follows that the expected average cost
of $\mathcal{P}_T$ over $(0,T)$ exceeds that of $\mathcal{P}$ by at most a
relatively small amount.

If the candidate policy is not optimal then there is an infimum
$\gamma$ of the expected average costs over long time intervals where
$\gamma < \gamma_0$. Then there is a sequence of times $T_i$ and policies
$\mathcal{P}_i$ so that $T_i^{-1} E C_{\mathcal{P}_i}(x,0,T_i) \to \gamma < \gamma_0$.
In that case (5.4) implies that $\lim T_i^{-1} E v_0[X(T_i)] \ge \gamma_0 - \gamma$
and hence $\lim T_i^{-1} E[Z_i^2(T_i)] \ge -2\mu_2(\gamma_0 - \gamma)/c_0$
since $v_0(x) \sim -c_0 z^2/2\mu_2$ as $z \to \infty$. Moreover the same
inequality applies for the restricted policy $\mathcal{P}_{i,T_i}$. Now let
$T_i' = T_i - T_i^{r+\delta}$. With very large probability

$$Z_{T_i}(T_i - t) \ge Z_{T_i}(T_i) - \mu_1 t - T_i^{\delta + r/2}, \qquad 0 \le t \le T_i^{r+\delta},$$

and $Z_{T_i}(T_i')$ is bounded above (the bound is illegible in the
source). Then if $\mu_1 > 0$, the holding cost over $(T_i',T_i)$
exceeds $(2\mu_1)^{-1}\{[Z_{T_i}(T_i) - T_i^{\delta + r/2}]\}^2$ and
if $\mu_1 = 0$, it exceeds a quantity involving
$[Z_{T_i}(T_i) - T_i^{\delta + r/2}]$ whose remaining factors are
illegible in the source. It follows that the expected holding cost
over $(T_i',T_i)$ is a substantial multiple of $T_i$. But then

$$\lim\, (T_i')^{-1} E C_{\mathcal{P}_{i,T_i}}(x,0,T_i') < \gamma$$

which contradicts our definition of $\gamma$.


9. Computation of Optimal Solutions


The  number  of  essentially  independent  parameters  for 
this  problem  is  so  great  that  it  is  unfeasible  to  tabulate 
the  solutions  even  in  the  linear  case.  It  is  preferable  to 
use  a numerical  method  for  computing  solutions  for  specific 
values  of  the  parameters.  There  are  many  numerical  approaches 
that  can  be  used  including  even  backward  induction.  Consistent 
with  the  general  analytic  approach  of  this  paper  are  several 
methods  which  apply  the  smoothness  condition  (5.3). 

For example in the case of the linear holding cost, one
approach that was used successfully is described below for
the case [condition illegible] $0$. If one anticipates a solution of
the form $C_{11} = [0,b)$ and $C_{22} = (a,\infty)$, then
when the switching costs $K_{12} = K_{21} = 0$, this solution
degenerates to the form where $a = b = z_0$, $v_1(z_0) = v_2(z_0)$,
and $v_1'(z_0) = v_2'(z_0)$. These last two equations are easily
solved for $\gamma$ and $z_0$. As $K_{12} + K_{21}$ increases, the
optimal $\gamma$, $a$ and $b$ change monotonically and the following
four equations

    $v_2(a) = v_1(a) + K_{21}$ ,   $v_1(b) = v_2(b) + K_{12}$ ,

    $v_1'(a) = v_2'(a)$ ,   $v_1'(b) = v_2'(b)$

involve the unknowns $\gamma$, $a$, $b$, and two constants of
integration, one of which can be arbitrarily set, say by the
equation $v_1(0) = 0$. This leaves four equations in four
unknowns which can be solved iteratively by Newton's method (using
$z_0$ for an initial approximation to $a$ and $b$). One may check
in advance to see whether one is in the transient case by checking the
inequalities (6.3) and (6.4) and determining $b$ and $L$ for
$\gamma = c_2 + I_2$.
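
The Newton iteration mentioned above can be sketched generically. The code below is not the authors' program: it is a minimal illustration of Newton's method for a square system with a forward-difference Jacobian, and the helper names (`solve_linear`, `newton_system`, `toy_residual`) are hypothetical. The toy system merely stands in for the four value-matching and smooth-fit equations, whose residuals would be supplied once explicit formulas for $v_1$ and $v_2$ are available.

```python
def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-14:
            raise ZeroDivisionError("singular Jacobian")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def newton_system(residual, x0, tol=1e-10, max_iter=50, step=1e-6):
    """Newton iteration for residual(x) = 0 with a forward-difference Jacobian."""
    x = list(x0)
    for _ in range(max_iter):
        f = residual(x)
        if max(abs(v) for v in f) < tol:
            return x
        n = len(x)
        J = [[0.0] * n for _ in range(n)]
        for j in range(n):
            xp = x[:]
            xp[j] += step
            fp = residual(xp)
            for i in range(n):
                J[i][j] = (fp[i] - f[i]) / step
        dx = solve_linear(J, [-v for v in f])
        x = [xi + di for xi, di in zip(x, dx)]
    raise RuntimeError("Newton iteration did not converge")


# Toy stand-in system (NOT the paper's equations): x^2 + y^2 = 4, x*y = 1.
def toy_residual(x):
    return [x[0] ** 2 + x[1] ** 2 - 4.0, x[0] * x[1] - 1.0]


root = newton_system(toy_residual, [2.0, 0.5])
```

For the problem in the text, `residual` would take the vector of the four unknowns and return the four equation residuals, with the starting point built from the degenerate solution $a = b = z_0$ as the passage suggests.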

An alternative approach was used for the following example
where the holding cost was non-linear, i.e.,

    $h(z) = z + 2.5[(z^2 + 2z + 2)e^{-z} - 2]$ .

This example was constructed to illustrate a solution of
the form $C_{11} = (a_1,a_2)$, $C_{22} = [0,b_1) \cup (b_2,\infty)$ where
$a_1 < b_1 < b_2 < a_2$. The function $h(z)$ is always positive
but after a brief rise near $z = 0$, dips close to zero
at $z = 3.31$ and then rises again, behaving asymptotically
like $z - 5$ as $z \to \infty$ and like $z$ near $z = 0$. With the
parameters $\mu_1 = -.5$, $\mu_2 = -1$, $\sigma_1^2 = .81$, $\sigma_2^2 = .49$,
$c_1 = .9$ and $c_2 = 1.0$ and $K = K_{12} + K_{21} = .03$, some
preliminary calculations suggested that for $z$ close to
zero and $z$ close to $\infty$, the approximate linearity and
the choice of parameters would make mode 2 preferable. However
for $z$ close to 3 the low holding cost suggested that it would
be desirable to let $z$ decrease slowly (i.e., to use mode 1).
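
The qualitative description of $h(z)$ can be checked directly from its formula. The following sketch evaluates $h$ and its derivative (obtained here by the product rule) and locates the turning points by bisection. The helper names are hypothetical and the brackets passed to `bisect` were chosen by inspection for this illustration.

```python
import math


def h(z):
    """Holding cost h(z) = z + 2.5[(z^2 + 2z + 2) e^{-z} - 2] from the text."""
    return z + 2.5 * ((z * z + 2.0 * z + 2.0) * math.exp(-z) - 2.0)


def h_prime(z):
    """Derivative of h; the product rule gives 1 - 2.5 z^2 e^{-z}."""
    return 1.0 - 2.5 * z * z * math.exp(-z)


def bisect(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    flo = f(lo)
    if flo * f(hi) > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if flo * f(mid) <= 0:
            hi = mid
        else:
            lo, flo = mid, f(mid)
    return 0.5 * (lo + hi)


z_peak = bisect(h_prime, 0.5, 1.5)  # end of the brief initial rise
z_dip = bisect(h_prime, 2.5, 4.0)   # location of the dip (h' changes - to +)
print(z_peak, z_dip, h(z_dip), h(50.0) - (50.0 - 5.0))
```

The dip found this way lies near $z = 3.31$ with a small positive value of $h$, and $h(z) - (z - 5)$ is negligible for large $z$, in agreement with the description in the text.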

The approach used was to select an approximation to $a_1$,
$a_2$, $b_1$, $b_2$ and to compute the potential function using the method
described at the end of Section 4. Then $v_1'(z) - v_2'(z)$ was evaluated
at $a_1$, $b_1$, $b_2$ and $a_2$. Then $a_1$, $b_1$, $b_2$ and $a_2$ were changed
to reduce $|v_1'(z) - v_2'(z)|$ at these four points. A positive
value of $v_1'(z) - v_2'(z)$ at $z = a_1$ and $z = a_2$ and
a negative value at $z = b_1$ and $z = b_2$ suggests
increasing these values of $a_i$ and $b_i$. The value of $v_1'(z) - v_2'(z)$
at each of these points depends mainly on that point and
after starting gingerly with small changes of $a_1$, $b_1$, $b_2$, $a_2$
in the appropriate directions, subsequent appropriate
changes lead to rapid convergence to $a_1 = .7786$, $b_1 = 1.4973$,
$b_2 = 3.6673$ and $a_2 = 4.5185$. The resulting value of
$\gamma = 1.206496$ is a rather slight improvement in effect over
the value $\gamma = 1.206736$ for our choice of the initial
approximation $a_1 = 1$, $b_1 = 2$, $b_2 = 3$, $a_2 = 4$.

Additional  computation  confirmed  that  the  optimality 
conditions  (5.1)  and  (5.2)  are  satisfied. 